In summary, this project showed that it is possible to model safety constraints and to use the KIDS constraint propagation tactic to generate efficient schedulers that create safe-by-construction maintenance plans for power plants. However, when we tried to obtain further support from the utility industry, we faced the problem that (1) the market is small (a

Abstract

We describe the synthesis of efficient schedulers for planned shutdowns of power plants for refueling and maintenance (outages), using an automated programming tool, KIDS. Currently, the utility industry has no automated tools to generate schedules that are both safe and resource-efficient. We focused on safety constraints since they are critical in this application. There are several aspects of this project that go beyond previous applications of KIDS to scheduling problems. First, scheduling of outages of power plants has a planning-like character since the scheduler needs to represent and maintain the complex state of the plant at all times considered during the scheduling process. Second, the particular safety constraints we considered required scheduling a pool of resources in the presence of time windows on each activity. To our knowledge, the control and data structures that we developed for handling such a pool are novel. In terms of design knowledge, the outage scheduling problem is modeled as a constraint satisfaction problem and the synthesized algorithm is an instance of global search with constraint propagation. The derivation of specialized representations for the constraints to perform efficient propagation is a key aspect of our approach. In addition, finite differencing complements constraint propagation by efficiently maintaining the state of the world.


1. Introduction

Planning and scheduling tasks are inherently complex. In computational terms, they are intractable, i.e., NP-hard or worse. As a practical consequence, realistic-size planning and scheduling problems cannot be solved optimally in a “reasonable” amount of time. Nonetheless, solutions have to be found for real-world problems, and therefore heuristic approaches have to be adopted, ideally with some guarantee on the quality of the solution.

Our approach emphasizes the fast generation of schedules to cope with large domains. We use a rich representation for the state of the world at any time (as in planning approaches) which allows efficient constraint reasoning, temporal reasoning in particular (as in scheduling). The problem is modeled as a constraint satisfaction problem and solved by combining a global search tactic with constraint propagation. The derivation of very specialized representations for the constraints to perform efficient propagation is a key aspect of our approach. Furthermore, finite differencing complements constraint propagation by efficiently maintaining the state of the world, allowing incremental computations. In our approach, constraints are compiled into the code; this is a novel aspect of our work, which uses an automatic programming system, KIDS (Kestrel Interactive Development System) [13]. Another novel aspect of our approach is the generation of schedules that are feasible over time windows rather than having single time points as start times, which increases schedule robustness.

We describe the application of our approach to the real-world problem of multiple resource-constrained project management. This problem is very common in manufacturing and it is a generalization of the well-known job-shop scheduling problem [3, 17]. As a particular instance of this problem, we consider the management of outages of power plants.1 An outage is a planned shutdown for refueling, repair, and maintenance. It is a rather daunting real-world task that may involve from 5,000 up to 45,000 activities. Furthermore, in this domain, the existence of good automatic solutions is crucial not only for safety reasons but also for economic reasons: the cost per day of shutdown is on the order of $1,000,000.

1 Rome Laboratory has a project in collaboration with Electric Power Research Institute (EPRI), Kaman Science, and Kestrel Institute to evaluate the use of advanced AI/OR planning and scheduling technology for outage management.

There are several aspects of this project that go beyond previous applications of KIDS to scheduling problems. First, scheduling of outages of power plants has a planning-like character since the scheduler needs to represent and maintain the complex state of the plant at all times considered during the scheduling process. Second, the particular safety constraints we considered required scheduling a pool of resources in the presence of time windows on each activity. To our knowledge, the control and data structures that we developed for handling such a pool are novel.

In the next section we discuss related work. In Section 3 we describe the outage problem. Section 4 describes our approach to schedule synthesis. In Section 5 we present performance results. We draw conclusions and discuss future work in Section 6.

2.  Related  Work 

Traditionally in AI, planning and scheduling are considered as two distinct phases of project management. Planning determines how to achieve a set of goals by performing a set of actions in a given (partial) order, and given an initial state of the world, without taking into consideration resource constraints. Scheduling, on the other hand, considers a more operational perspective: actions have times and resources assigned to them.

Current AI planners, e.g., SIPE-2 [19, 20] and O-PLAN [16], have very limited temporal and resource reasoning capabilities. To overcome this limitation and to improve the quality of the generated plans, SIPE-2, for instance, was integrated with TACHYON [1], a temporal reasoner, and with the capacity analysis module of OPIS (DITOPS), a scheduling system developed at CMU [14]. These experiments are reported in [2]. O-PLAN's conceptual model includes constraint managers sharing a common constraint representation. For example, resource utilization is handled by one manager, temporal reasoning by another, and so on. Tate [15] suggests replacing O-PLAN's simple time constraint manager with a better temporal system, an approach similar to the integration of SIPE-2 and TACHYON. Tate also suggests adding other types of constraint managers, such as spatial reasoning, through a modular interface with the planner. Although this type of integration adds capabilities to the planners (e.g., temporal, resource or spatial reasoning), its modular nature (plan critics in SIPE-2 or constraint managers in O-PLAN) is not computationally efficient.

The separation of planning and scheduling does not reflect operational reality; it is an artifact of current approaches to automated planning and scheduling. Because of resource constraints, a plan should not be considered feasible unless it has a feasible schedule instantiation. Therefore, the integration of planning and scheduling seems natural. Moreover, temporal and resource constraints from the scheduling problem can be used in the planning phase to prune large parts of the search space.

AI schedulers, on the other hand, incorporate temporal and resource reasoning techniques (e.g., [14], [10], [7]). However, scheduling approaches typically do not model the state of the world (as in planning).

Some researchers have addressed the integration of planning and scheduling. For instance, HSTS [8] is a framework unifying planning and scheduling with a rich domain representation allowing constraint propagation. The main limitation of such approaches is the lack of any guarantee that good-quality solutions are produced quickly. The collection Intelligent Scheduling [21] includes several AI planners and schedulers as well as systems that integrate planning and scheduling.

The main innovation of our approach compared to other planning and scheduling approaches is the derivation of very specialized constraints that are compiled into the search and control mechanisms [5, 6]. Existing planning and scheduling approaches use constraint representations and operations that are geared for a broad class of problems. Our approach derives specialized representations for constraints, allowing fast constraint checking and constraint propagation.

Another novel aspect of our approach is the generation of schedules that have time windows as start times, which adds robustness to the solution. Existing AI approaches to scheduling with resource constraints only generate a single solution, feasible for single start times, without any guarantees of feasibility over time windows. With our approach, we generate an infinite family of feasible schedules.

The framework selected for this project was KIDS (Kestrel Interactive Development System) [13], which supports users in transforming declarative problem specifications into correct and efficient programs. The transformations provided in KIDS are designed to perform significant and meaningful actions in terms of search efficiency. The various transformations in KIDS include algorithmic transformations, program optimization techniques and data structure refinement. The algorithmic transformations allow the user to add search and control mechanisms to a given problem specification. Finite differencing is another important transformation provided by KIDS. KIDS uses a form of deductive inference called directed inference to reason about the problem specification in order to automatically apply tactics, derive filters and perform constraint propagation. KIDS has been used to derive fast and accurate transportation schedulers from formal specifications on large-scale transportation planning problems. A typical transportation problem with 10,000 movement requirements takes the derived scheduler 1 to 3 minutes to solve, compared with 2.5 hours for a deployed feasibility estimator (JFAST) and 36 hours for deployed schedulers (FLOGEN, ADANS). The computed schedules use relatively few resources and satisfy all specified constraints [11].

In this paper we show how previous applications of KIDS to scheduling problems can be extended to tackle a much richer real-world scheduling task with planning-like features involving complex safety constraints and resource constraints in the presence of time windows on each activity, a novel aspect of the approach reported here.

3. Management of Outages of Power Plants

The planning and scheduling of outages of power plants have a great impact in terms of the outage costs (replacement power, labor cost, etc.), use of scarce resources and implementation of safety procedures. During an outage several activities are performed, such as refueling operations, plant betterment, preventive maintenance, corrective maintenance, and technical specification requirements for inspections or surveillance [9, 18]. Depending on the particular plant, as well as the scope of the activities performed during the outage, the planning and scheduling of outages for power plants might involve from 5,000 up to 45,000 activities. Explicit precedence constraints between activities are defined in work order activities. Other constraints between activities arise as a result of different technological constraints. The general principle underlying the outage procedures is that outages should be as short as possible while maintaining the appropriate level of safety. The main safety functions that are monitored during an outage are: AC power control system, primary and secondary containment, fuel pool cooling system, inventory control, reactivity control, shutdown cooling, and vital support systems. In this paper, we describe the generation of schedules for outages of power plants enforcing safety conditions regarding AC power control.

The current scheduling software tools used by the utilities are very simple: planning and scheduling still rely heavily on the experience of the manual schedulers, rather than on automatic procedures. Current automatic approaches to outage scheduling mainly consist of the application of automatic project management techniques, such as PERT and CPM, which only handle simple before/after precedence relationships between activities without taking into consideration resource constraints and safety constraints. Safety and risk assessment is usually a manual process which calls on the expertise of the personnel involved to make decisions based on published policies and procedures. In order to ensure that the sequence of activities performed during an outage follows the safety requirements, the utilities perform the risk assessment of the schedules using the software ORAM (Outage Risk Assessment Methodology). ORAM, an EPRI tool for risk assessment, simulates the execution of the schedule, keeping track of the configuration of the plant at any time. If the schedules do not satisfy the safety constraints, manual adjustments have to be performed in order to meet the safety requirements.

4.  Schedule  Synthesis 

Our approach to planning and scheduling combines a constraint satisfaction paradigm with a global search tactic with constraint propagation. We developed a prototype for the domain of outages of power plants, ROMAN (Rome Lab Outage MANager). ROMAN includes all the technological constraints currently incorporated in the automatic tools used by the utilities for schedule generation. In addition, it includes all the constraints regarding the safety function AC power. Other safety functions could be modeled in a similar way. A top-level formal specification of the outage problem including the safety function AC power follows:2

function safe-outage-windows(activities)
    returns (schedule |
        Consistent-Activity-Separation(schedule) ∧
        Consistent-AC-power(schedule) ∧
        All-activities-scheduled(activities, schedule))

In this formulation activities corresponds to the set of activities to be performed. Each activity has a given duration, a set of predecessors, and a set of effects on resources. The schedule is a partial order of activities. Activities in the schedule have time windows assigned to them. A time window defines the earliest start time (est) and latest start time (lst) of an activity, such that the activity can start at any time during the window without increasing the overall duration of the project. Given the duration of the activity, the earliest finish time (eft) and latest finish time (lft) can be calculated. The predicate Consistent-Activity-Separation(schedule) states that all the activities in the schedule satisfy the precedence constraints. The predicate Consistent-AC-power(schedule) states that the schedule satisfies the safety constraints from an AC power point of view. As a completeness condition, the predicate All-activities-scheduled(activities, schedule) states that all the activities have to be scheduled.

2 We modeled the AC power safety function as a proof of concept. Other safety functions could be modeled in a similar way.

The notion of state of the plant is a key concept in enforcing safety constraints. In outage management, the state of the plant is measured in colors: green, yellow, orange or red, in order of increasing risk. Figure 1 illustrates the types of decision trees regarding safety levels for the case of AC power. Basically, the AC power safety constraint states the conditions, in terms of availability of AC power resources and types of activities being executed, for which the state of the plant is safe (green or yellow). For instance, if there is an activity being executed that has the potential to cause AC power loss, then in order for the plant to be in a yellow state it is required to have two off-site AC power sources available and three operable emergency safeguard buses.

Since the start times of activities are defined over time windows, we introduce two concepts regarding the execution of an activity: the definite period and the potential period of an activity. The definite period of an activity corresponds to the period of time during which the activity is definitely being executed: it is the interval of time between the latest start time of the activity (lst) and its earliest finish time (eft). The potential period of an activity corresponds to the period of time during which the activity may be executed: it is the time period between the earliest start time of the activity (est) and its latest finish time (lft). Figure 2 illustrates the notion of definite period of an activity. Notice that activity A does not have a definite period, since its earliest finish time is before its latest start time.
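The two periods follow directly from est, lst and the duration. The following is a minimal sketch, not the code synthesized by KIDS: the class name `TimeWindow`, the function names, integer time units and half-open intervals are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TimeWindow:
    """Start-time window of one activity (field names follow est/lst in the text)."""
    est: int        # earliest start time
    lst: int        # latest start time
    duration: int   # fixed activity duration

    @property
    def eft(self) -> int:
        """Earliest finish time."""
        return self.est + self.duration

    @property
    def lft(self) -> int:
        """Latest finish time."""
        return self.lst + self.duration


def potential_period(w: TimeWindow) -> Tuple[int, int]:
    """Interval in which the activity may be executing: [est, lft)."""
    return (w.est, w.lft)


def definite_period(w: TimeWindow) -> Optional[Tuple[int, int]]:
    """Interval in which the activity is certainly executing: [lst, eft).

    Returns None when eft <= lst: the window is then so wide that no
    instant is covered by every admissible start time.
    """
    if w.eft <= w.lst:
        return None
    return (w.lst, w.eft)
```

For example, an activity of duration 5 with window est = 0, lst = 2 is definitely executing on [2, 5) and potentially executing on [0, 7), whereas an activity of duration 4 with est = 0, lst = 6 has no definite period.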

Figure 1. Example of a decision tree for the safety function AC Power

In addition, we define two other concepts: definite state of the plant and potential state of the plant. The definite state of the plant is associated with the concept of definite period: it represents the state of the plant for a given safety function (e.g., AC power) assuming that activities are only executed during their definite period. The concept of potential state of the plant is associated with the concept of potential period of an activity: it represents the state of the plant for a given safety function assuming that activities are executed during the whole extension of their potential periods. The potential state of the plant is always “equal” to or “greater” than the definite state of the plant, since the definite period of an activity tends to underestimate the duration of activities while the potential period of an activity tends to overestimate the duration of activities. Figure 3 gives an example. Note that during certain time intervals, the definite and potential states of the plant coincide.
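The relation between the two states can be sketched numerically. As a simplifying assumption, the sketch below replaces the color-coded state by a plain count of activities in execution at time t, once under definite periods and once under potential periods; the real state is derived from decision trees such as the one in Figure 1, which are not reproduced here. The function name and the tuple layout `(est, lst, duration)` are hypothetical.

```python
from typing import List, Tuple

# Each activity is (est, lst, duration); integer time, half-open intervals.
Activity = Tuple[int, int, int]


def active_count(t: int, activities: List[Activity], mode: str) -> int:
    """Count activities executing at time t.

    mode == "definite":  use [lst, eft), where the activity is surely running.
    mode == "potential": use [est, lft), where the activity may be running.
    """
    count = 0
    for est, lst, duration in activities:
        if mode == "definite":
            start, end = lst, est + duration
        elif mode == "potential":
            start, end = est, lst + duration
        else:
            raise ValueError("mode must be 'definite' or 'potential'")
        if start <= t < end:  # an empty definite period never matches
            count += 1
    return count


activities = [(0, 2, 5), (3, 4, 2)]
definite = [active_count(t, activities, "definite") for t in range(8)]
potential = [active_count(t, activities, "potential") for t in range(8)]
```

With these two activities the definite count is never above the potential count, and the two coincide at some instants (for example t = 4), mirroring the observation made for Figure 3.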
Figure 2. Notion of a definite period.

Figure 3. Definite and potential states of the plant.


4.1.  Search  and  Control  Mechanisms 


Figure 4. ROMAN's approach

KIDS  provides  algorithmic  transformations  that  add 
control  and  search  mechanisms  to  a  given  specification. 
The  search  tactic  selected  for  the  outage  problem  was 
global  search  (see  next  section).  Figure  4  summarizes 
the  approach  adopted  in  ROMAN. 

Initially, global search is applied to the formal specification of the outage problem in order to generate a schedule, assuming the definite period of activities. Since the notion of definite period tends to underestimate the duration of the activities, the schedule produced in this initial phase is very likely to be infeasible from the point of view of the potential state of the plant. In order to enforce the safety threshold for the potential state of the plant at any time during the outage, “refinement” of the time windows of the initial schedule takes place. In the next section, we describe global search theory.

Figure 5. Global search theory for the Outage Problem


4.2.  Global  Search  Theory 

Global search [12, 11] is a backtrack algorithm, a refinement of generate-and-test. The tactic is implemented by finding a space containing all the solutions to the problem that can be divided into nested subspaces. The global search algorithm starts with an initial set that contains all the solutions to the given problem instance, repeatedly extracts solutions, splits sets, and eliminates subsets using propagation, until no sets remain to be split. The process can be described as a tree search in which a node represents a set of candidates, and an arc represents the split relationship between a set and a subset. The principal operations are to extract candidate solutions from a set and to split a set into subsets. The derivation of efficient cutting constraints that eliminate subspaces that do not contain any feasible solution is an important complementary operation in the derivation of the global search tactic.
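The extract/split/cut loop described above can be sketched generically. This is an illustrative skeleton under our own assumptions, not the program scheme that KIDS instantiates: the function `global_search`, its parameter names and the toy sequencing problem used to exercise it are all hypothetical.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Descriptor = Tuple[str, ...]  # a partial sequence standing for a set of candidates


def global_search(
    initial: Descriptor,
    split: Callable[[Descriptor], Iterable[Descriptor]],
    extract: Callable[[Descriptor], Optional[Descriptor]],
    cut: Callable[[Descriptor], Optional[Descriptor]],
) -> Iterator[Descriptor]:
    """Depth-first global search.

    cut(node) returns a (possibly refined) descriptor, or None when the
    subspace provably contains no feasible solution.
    """
    root = cut(initial)
    stack = [] if root is None else [root]
    while stack:
        node = stack.pop()
        solution = extract(node)
        if solution is not None:
            yield solution
        for child in split(node):
            refined = cut(child)
            if refined is not None:  # otherwise the subspace is eliminated
                stack.append(refined)


# Toy instance: order three items so that "a" comes before "b".
ITEMS = ("a", "b", "c")
BEFORE = {("a", "b")}


def split(seq: Descriptor) -> Iterable[Descriptor]:
    return [seq + (x,) for x in ITEMS if x not in seq]


def extract(seq: Descriptor) -> Optional[Descriptor]:
    return seq if len(seq) == len(ITEMS) else None


def cut(seq: Descriptor) -> Optional[Descriptor]:
    for first, second in BEFORE:
        if second in seq and first not in seq:
            return None  # "b" was placed before "a": prune this subtree
    return seq


solutions = sorted("".join(s) for s in global_search((), split, extract, cut))
```

In the toy instance, every partial sequence that places "b" before "a" is cut as soon as it is generated, so none of its extensions is ever explored.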

Figure 5 illustrates the global search theory for the initial scheduling of the activities considering their definite periods. In this global search theory the initial subspace descriptor (partial schedule) is the empty sequence (empty schedule). Splitting corresponds to appending an unscheduled activity, with a given time window, to the partial schedule. Cutting corresponds to propagating the constraints over the time windows of the activities in the partial schedule. Notice that cutting makes the time windows shrink. It can also split a time window, as in the case of activity G: due to propagation, activity G's window was split into two. As we can see from Figure 5, most of the work in this global search theory is performed by constraint propagation. Splitting corresponds to just selecting the next activity to schedule, using a heuristic that favors shorter schedules.3 Extraction takes place when all the activities have been scheduled.
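The level-based ordering of unscheduled activities described in footnote 3 can be computed as follows. This is a sketch under the assumption that the precedence graph is acyclic; the function name `activity_levels` and the dictionary representation of predecessors are ours.

```python
from typing import Dict, Set


def activity_levels(predecessors: Dict[str, Set[str]]) -> Dict[str, int]:
    """Level 0 for activities without predecessors; otherwise one more than
    the highest level among the activity's predecessors.

    Assumes the precedence graph is acyclic (a cycle would recurse forever).
    """
    levels: Dict[str, int] = {}

    def level(activity: str) -> int:
        if activity not in levels:
            preds = predecessors.get(activity, set())
            levels[activity] = 0 if not preds else 1 + max(level(p) for p in preds)
        return levels[activity]

    for activity in predecessors:
        level(activity)
    return levels


levels = activity_levels({"A": set(), "B": {"A"}, "C": {"A", "B"}, "D": set()})
order = sorted(levels, key=lambda a: (levels[a], a))  # a topological order
```

Sorting by level yields a topological order, so an activity is never selected before its predecessors.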

The operator extract corresponds to the second global search algorithm. Refinement of time windows takes place if, after applying the initial global search to the outage problem, the potential state of the plant does not satisfy the safety requirements. In other words, refinement of time windows is required to enforce the safety constraints over the potential period of all the activities in the initial schedule. This is achieved by applying a new global search to the formal specification of the outage, but using as input the schedule generated in the initial phase. In this second phase the windows of the activities that contribute to the contention periods, i.e., the periods in which the potential state of the plant is above the safety threshold, are systematically reduced until the potential state of the plant becomes consistent from the safety point of view for all the times during the outage. In the global search theory for the refinement phase, splitting corresponds to reducing the size of the windows of the activities involved in the contention periods.

4.3. Constraint Propagation


Figure  6.  Cutting  Constraints 

As pointed out in the previous section, one of the important features of our approach is the propagation of constraints. Figure 6 illustrates the concept of constraint propagation. Here psched represents a partial schedule, that is, a set of candidate solutions and a node of the global search tree. The following test states that a partial schedule can be extended to a complete feasible schedule:4

$$
\exists (\mathit{sched})\; \bigl(\mathit{sched} \in \mathit{psched} \;\wedge\; \mathit{feasible}(\mathit{sched}, \mathit{activities})\bigr)
$$

However, this test is in general too expensive computationally. Instead, we derive necessary conditions for it, called filters, i.e.:

$$
\exists (\mathit{sched})\; \bigl(\mathit{sched} \in \mathit{psched} \;\wedge\; \mathit{feasible}(\mathit{sched}, \mathit{activities})\bigr) \implies \Psi(\mathit{sched}, \mathit{psched})
$$

The next step consists in incorporating $\Psi(\mathit{sched}, \mathit{psched})$ into psched, i.e.:

$$
\xi(\mathit{psched}) \iff \forall (\mathit{sched})\; \bigl(\mathit{sched} \in \mathit{psched} \implies \Psi(\mathit{sched}, \mathit{psched})\bigr)
$$

The test $\xi(\mathit{psched})$ holds when all the candidate solutions in psched satisfy $\Psi$. The main issue is, when a given psched does not satisfy $\xi$, how can we incorporate $\xi$ into psched? The answer is to find the greatest refinement of psched, written psched', that satisfies $\xi$:

$$
\mathit{psched}' = \max_{\sqsupseteq} \{\, \mathit{qsched} \mid \mathit{psched} \sqsupseteq \mathit{qsched} \;\wedge\; \xi(\mathit{qsched}) \,\}
$$

which asserts that psched' is maximal over the set of descriptors that refine psched and satisfy $\xi$, with respect to the ordering $\sqsupseteq$. We want psched' to be a refinement of psched so that all of the information in psched is preserved, and we want psched' to be maximal so that no information other than psched and $\xi$ is incorporated into psched'. The refinement relation $\mathit{psched}_j \sqsupseteq \mathit{psched}_i$ holds when the completions of $\mathit{psched}_i$ are a subset of the completions of $\mathit{psched}_j$.

3 We also define a topological sort of the unscheduled activities according to their levels. An activity has level 0 if it has no predecessors. Activities that only have as predecessors activities of level 0 have level 1. Activities of level 2 only have as predecessors activities that have level 0 or 1, etc.

4 In the particular case of the outage problem, $(\mathit{sched} \in \mathit{psched}) \iff \bigl(\mathit{domain}(\mathit{psched}) \subseteq \mathit{domain}(\mathit{sched}) \;\wedge\; \forall (i)\,(i \in \mathit{domain}(\mathit{psched}) \implies \mathit{psched}(i).\mathit{est} \le \mathit{sched}(i).\mathit{st} \le \mathit{psched}(i).\mathit{lst})\bigr)$ and $\mathit{feasible}(\mathit{sched}, \mathit{activities}) \iff \bigl(\textit{consistent-separation}(\mathit{sched}) \;\wedge\; \textit{consistent-acp}(\mathit{sched}) \;\wedge\; \textit{all-activities-scheduled}(\mathit{activities}, \mathit{sched})\bigr)$.

KIDS instantiates a program scheme for global search with constraint propagation, incorporating $\xi$. For more detail on propagation in KIDS see [11].
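To make the idea of a greatest refinement concrete, the sketch below narrows start-time windows under precedence constraints until nothing changes. It is a hand-written illustration, not code synthesized by KIDS: the function name `propagate`, the dictionary layout, integer time units and the "finish before the successor starts" reading of precedence are assumptions. Each rule only discards start times that cannot occur in any feasible completion, so the fixpoint reached is the largest set of windows consistent with these rules.

```python
from typing import Dict, List, Optional, Tuple

Windows = Dict[str, Tuple[int, int]]  # activity -> (est, lst)


def propagate(
    windows: Windows,
    durations: Dict[str, int],
    precedences: List[Tuple[str, str]],  # (before, after) pairs
) -> Optional[Windows]:
    """Shrink windows to a fixpoint; return None if some window becomes empty."""
    w = dict(windows)
    changed = True
    while changed:
        changed = False
        for before, after in precedences:
            est_b, lst_b = w[before]
            est_a, lst_a = w[after]
            # The successor cannot start before the predecessor can finish.
            new_est_a = max(est_a, est_b + durations[before])
            # The predecessor must finish by the successor's latest start.
            new_lst_b = min(lst_b, lst_a - durations[before])
            if new_est_a != est_a:
                w[after] = (new_est_a, lst_a)
                changed = True
            if new_lst_b != lst_b:
                w[before] = (est_b, new_lst_b)
                changed = True
            if w[after][0] > w[after][1] or w[before][0] > w[before][1]:
                return None  # empty window: the partial schedule is infeasible
    return w


refined = propagate(
    {"A": (0, 10), "B": (0, 10), "C": (0, 10)},
    {"A": 3, "B": 4, "C": 2},
    [("A", "B"), ("B", "C")],
)
```

In this example the windows shrink to A: (0, 3), B: (3, 6) and C: (7, 10). Windows only ever shrink over a finite integer range, which is why the loop terminates.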

The challenge in taking advantage of the propagation mechanisms provided in KIDS lies in finding $\xi$: even though KIDS provides a tactic to synthesize propagation code incorporating $\xi$, the derivation of $\xi$ is not a straightforward task and has to be done manually.

In the case of the outage problem, the predicate Consistent-Activity-Separation(schedule) states that all the activities in the schedule satisfy the precedence constraints. The derivation of cutting constraints from the constraint Consistent-Activity-Separation, using the formulas for calculating $\Psi(\mathit{sched}, \mathit{psched})$ and the test $\xi(\mathit{psched})$ presented above, leads to the well-known constraints on est and lst, as used in PERT. The derivation of cutting constraints for Consistent-ACP is less straightforward. An example of an inferred constraint from Consistent-ACP follows:

$$
\begin{aligned}
&\forall (i, t_1, t_2, \mathit{act}) \\
&\quad \bigl(\, i \in \mathit{domain}(\mathit{se}(\mathit{psched})) \\
&\quad \wedge\; t_1 = \mathit{se}(\mathit{psched})(i).\mathit{time} \\
&\quad \wedge\; t_2 = \mathit{se}(\mathit{psched})(i+1).\mathit{time} \\
&\quad \wedge\; \mathit{act} \in \mathit{domain}(\mathit{psched}) \\
&\quad \wedge\; \textit{sacpl?}(t_1, \mathit{psched}) \\
&\quad \wedge\; \textit{unav-sources}(t_1, \mathit{psched}) = \mathit{TSACPL} \\
&\quad \wedge\; \textit{affects-avail-acps?}(\mathit{act}, \mathit{psched}) \\
&\quad \implies \mathit{psched}(\mathit{act}).\mathit{lft} < t_1 \;\vee\; \mathit{psched}(\mathit{act}).\mathit{est} > t_2 \,\bigr)
\end{aligned}
$$

The state events of the partial schedule, considering the definite periods of activities, are computed by se(psched). A state event is any event that affects the state of the plant. The terms of the constraint read as follows: se(psched)(i).time is the time of the i-th state event of the partial schedule; the predicate sacpl?(t, psched) tests whether at time t the plant is in a state of AC power loss; unav-sources(t, psched) = TSACPL tests whether at time t the number of unavailable AC power resources equals the threshold for AC power resource unavailability for a state of AC power loss; affects-avail-acps?(act, psched) tests whether the activity act affects an available AC power resource; and psched(act).lft and psched(act).est are, respectively, the latest finish time and earliest start time of the activity act in the partial schedule psched. This constraint triggers propagation for the activities that affect available AC power resources: propagation eliminates from the activities' time windows the periods that overlap the intervals that correspond to a state of AC power loss with the number of unavailable AC power resources equal to TSACPL (the threshold). In other words, a new activity that affects available AC power resources cannot occur during a period for which the plant is operating at the threshold regarding the AC power safety function.
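The effect of this kind of cutting on a single time window can be sketched as follows. The sketch works on admissible start times rather than on the est/lft test of the formula, and it assumes integer time and half-open intervals; the function name `cut_window` and its parameters are hypothetical. An activity started at s occupies [s, s + duration), so it avoids a forbidden interval [t1, t2) exactly when s + duration <= t1 or s >= t2.

```python
from typing import List, Tuple


def cut_window(
    est: int, lst: int, duration: int, t1: int, t2: int
) -> List[Tuple[int, int]]:
    """Remove from the start-time window [est, lst] every start time that
    would make the activity overlap the forbidden interval [t1, t2).

    Returns zero, one or two sub-windows of admissible start times.
    """
    pieces: List[Tuple[int, int]] = []
    # Starts early enough to finish by t1.
    left_hi = min(lst, t1 - duration)
    if est <= left_hi:
        pieces.append((est, left_hi))
    # Starts at or after t2.
    right_lo = max(est, t2)
    if right_lo <= lst:
        pieces.append((right_lo, lst))
    return pieces


split_in_two = cut_window(est=0, lst=10, duration=2, t1=4, t2=6)
emptied = cut_window(est=3, lst=5, duration=2, t1=4, t2=6)
untouched = cut_window(est=0, lst=2, duration=1, t1=10, t2=12)
```

The first call shows how propagation can split a window into two, as noted for activity G in Figure 5; the second shows a window that is emptied, which would cause the corresponding partial schedule to be pruned.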

Space does not permit elaborating on the termination conditions for propagation. Nevertheless, we briefly point out that the termination guarantee follows from the application of Tarski's fixpoint theorem. The assumptions required for the theorem are verified in the global search theory for the outage problem. For more detailed information on this issue see, e.g., [4, 11]. For a formal description of the (manual) derivation of the constraints incorporating the test $\xi$ for Consistent-Activity-Separation and Consistent-ACP see [5]. The manual derivation of these constraints represented more than 60% of the time spent on the design of the outage domain theory.

4.4. Interaction Between the Schedule and the State of the Plant

A main principle embodied in our approach is incremental
computation. Propagation illustrates the concept: whenever a
new activity is scheduled, all constraints are immediately
propagated over the schedule. Finite differencing is another
transformation that allows for incremental computation, by
efficiently maintaining the state of the plant. Roughly, the
idea behind finite differencing is to incrementally evaluate
an expensive expression in a loop, rather than recomputing
it from scratch each time. As an example, let us assume that
function f(x) calls function g(x) and that x changes in a
regular way. In this case, it might be worthwhile to create
a new variable whose value is maintained and which allows
for incremental computation. By abstracting function f with
respect to expression g(x), a new parameter c is added to
f's parameter list (now f(x, c)) and c = g(x) is added as a
new input invariant to f. Any call to f, whether a recursive
call within f or an external call, must now be changed to
supply the appropriate new argument that satisfies the
invariant: f(x) is changed to f(x, g(x)). In this process
all occurrences of g(x) are replaced by c. Often,
distributive laws (see footnote 5) apply to g(h(x)),
yielding an expression of the form h'(g(x)) and so h'(c).
The real benefit of the optimization comes from this last
step, because this is where the new value of the expression
g(h(x)) is computed in terms of the old value of g(x).
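A minimal Python sketch of this transformation follows. It is an illustration under assumed choices, not output of KIDS: g is taken to be the sum of a list, x changes by appending one element (h(x) = x + [a]), and the distributive law used is g(x + [a]) = g(x) + a, so that h'(c) = c + a. The function names are hypothetical.

```python
from typing import List


def g(xs: List[int]) -> int:
    """Stand-in for the 'expensive' expression: the sum of the list."""
    return sum(xs)


def f_naive(xs: List[int], new_items: List[int]) -> List[int]:
    """Recompute g(x) from scratch after every change to x."""
    xs = list(xs)
    totals = []
    for a in new_items:
        xs.append(a)           # x changes in a regular way: h(x) = x + [a]
        totals.append(g(xs))   # full recomputation each iteration
    return totals


def f_differenced(xs: List[int], new_items: List[int]) -> List[int]:
    """Same result, but with c maintained so that c == g(xs) always holds."""
    xs = list(xs)
    c = g(xs)                  # establish the invariant c == g(xs) once
    totals = []
    for a in new_items:
        xs.append(a)
        c = c + a              # law: g(x + [a]) == g(x) + a, i.e. h'(c) = c + a
        totals.append(c)       # every occurrence of g(xs) is replaced by c
    return totals


if __name__ == "__main__":
    print(f_naive([1, 2], [3, 4]))
    print(f_differenced([1, 2], [3, 4]))
```

Both functions return the same values; the differenced version does constant work per update instead of rescanning the whole list.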

Footnote 5: Laws are assertions that define axioms or
theorems, i.e., statements that are always true. An
assertion is simply a true statement. An example of a law is
(A + B) * C = (A * C) + (B * C), or (A and B → A). The idea
is to provide information on how to distribute predicates
and functions over the main constructors of the variable
that changes in a regular way, exactly in the same way one
would write a law about how to distribute multiplication
over addition. Additionally, laws also specify special
cases, for instance when dealing with base cases (e.g.,
empty sequences).

Figure 7. Interaction between the schedule and the state

In the outage problem there are several opportunities for
finite differencing, since the state of the plant is a
function of the schedule represented by the constraint
consistent-acp(schedule). Figure 7 shows the interactions
between the state of the plant and the schedule: when a new
activity is scheduled, it impacts the schedule and
propagation is triggered. Changes in the schedule impact the
state of the plant, which is incrementally maintained by
finite differencing. Changes in the state impact the
schedule and propagation is triggered, which impacts the
schedule, and so on. The key to taking advantage of finite
differencing is to provide good laws on how to distribute
the functions to be finite differenced over the main
constructors of the partial schedule, e.g., over appending
an activity to the schedule, increasing the est of an
activity, etc.
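To make the idea of a law over a schedule constructor concrete, the following Python sketch maintains a simplified plant state incrementally as activities are appended. Everything here is an assumption for illustration: the discrete time slots, the `(start, finish)` activity representation, the rule that each activity makes exactly one AC power source unavailable while it runs, and all names. It is not the representation used in ROMAN.

```python
from typing import List, Tuple

# Hypothetical activity: occupies time slots start .. finish-1 and makes
# one AC power source unavailable during those slots.
Activity = Tuple[int, int]


def unavailable_from_scratch(schedule: List[Activity], horizon: int) -> List[int]:
    """Recompute the number of unavailable sources per slot from the whole schedule."""
    counts = [0] * horizon
    for start, finish in schedule:
        for t in range(start, finish):
            counts[t] += 1
    return counts


class IncrementalPlantState:
    """Keeps counts == unavailable_from_scratch(schedule, horizon) as an invariant."""

    def __init__(self, horizon: int) -> None:
        self.schedule: List[Activity] = []
        self.counts = [0] * horizon

    def append(self, activity: Activity) -> None:
        # Law over the 'append' constructor: the state of schedule + [a]
        # is the old state plus the contribution of a alone.
        start, finish = activity
        self.schedule.append(activity)
        for t in range(start, finish):
            self.counts[t] += 1

    def slots_at_threshold(self, threshold: int) -> List[int]:
        """Slots where the number of unavailable sources equals the threshold."""
        return [t for t, n in enumerate(self.counts) if n == threshold]


if __name__ == "__main__":
    state = IncrementalPlantState(horizon=6)
    state.append((0, 3))
    state.append((2, 5))
    print(state.counts, state.slots_at_threshold(2))
```

Each append touches only the slots covered by the new activity, whereas the from-scratch version revisits every activity already in the schedule.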

5.  Performance  Results 
Figure 8. Time performance

The current version of ROMAN was completed in November 1995,
and it has been demonstrated to several large power plants
such as American Electric Power Service, Baltimore Gas &
Electric, and PECO Energy. The demonstration was successful,
and EPRI, a consortium of more than 90% of the utilities in
the US, is looking into using the approach embodied in ROMAN
to build the next generation of outage scheduling tools,
referred to as the Advanced Technology Outage Scheduler.


ROMAN has proven successful since it clearly extends the
functionality offered by existing software tools for outage
management. All the technological constraints currently used
for automatic schedule generation are incorporated into the
system. In addition, ROMAN produces schedules that enforce
safety constraints; AC power was used as a proof of concept.

The current version of ROMAN schedules up to 2,000
activities in approximately 1 minute on a Sparc 2 (see
Figure 8). The schedules produced by ROMAN are often better
than the current solutions since many new possibilities are
explored compared to manual solutions. Human schedulers tend
to aggregate tasks and schedule them as blocks rather than
exploring interesting possibilities that occur when the
activities are scheduled separately.

A key feature of ROMAN that utility personnel find
attractive is the robustness of the schedules it generates.
The current scheduler generates a schedule that includes a
start time window for each task. Choosing any start time
within the window for a task still permits feasible
execution of the schedule. The window provides information
about how critical the start time for a task is: if a
predecessor task is delayed, a user can decide whether there
is still enough freedom in the start time window to allow
on-time completion, or whether it is time to reschedule
parts of the overall operation.
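The decision described above can be pictured with a small Python sketch. The function name, its parameters, and the integer time units are hypothetical; the sketch only shows how a start time window could be compared against a delayed predecessor, not how ROMAN presents this information.

```python
def remaining_slack(earliest_start: int, latest_start: int,
                    predecessor_finish: int) -> int:
    """Freedom left in a task's start window after a predecessor delay.

    A negative result means the task can no longer start inside its
    window, which suggests rescheduling part of the operation.
    """
    effective_earliest = max(earliest_start, predecessor_finish)
    return latest_start - effective_earliest


if __name__ == "__main__":
    print(remaining_slack(earliest_start=10, latest_start=15, predecessor_finish=12))
    print(remaining_slack(earliest_start=10, latest_start=15, predecessor_finish=17))
```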

ROMAN currently comes configured with a GUI that displays an
interactive Gantt chart for tasks, showing their start time
window, duration, task description, and predecessors.
Another Gantt chart shows the history of the state of the
plant with respect to AC power.

6. Conclusions and Future Work

ROMAN has successfully demonstrated that outage schedules
that satisfy safety constraints can be generated quickly.
Developing ROMAN into a practical tool requires (1) handling
a richer model of the outage domain to incorporate other
safety constraints, and (2) faster code. Future work
includes developing better techniques for scaling up the
scheduler through data structures and search strategies that
are better suited to the problem domain. To date we have
focused on one particular safety function, which deals with
maintaining adequate sources of AC power. Future work for
the domain of outage management is planned to deal with
other critical safety constraints and with scheduling scarce
resources such as heavy lifts and skilled personnel.